Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Add UUID tests and fix version 5 bits #1362

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

shlokkhemani
Copy link

@shlokkhemani shlokkhemani commented Dec 22, 2024

What this PR does

  1. Adds comprehensive test coverage for our stringToUuid function:
    • Checks UUID format (length, hyphen placement, regex matching).
    • Verifies version and variant bits (RFC 4122 compliance).
    • Tests various inputs (numbers, Unicode, empty strings, edge cases).
    • Ensures consistent, deterministic output.
  2. Fixes the version bits to properly label the UUID as “version 5”:
    • Previously, we weren’t strictly setting the top nibble in hashBuffer[6] to 0x5.
    • This change enforces RFC 4122 compliance by setting (hashBuffer[6] & 0x0f) | 0x50.

Why this matters

  • Correct RFC 4122 compliance: We now correctly produce v5 UUIDs.
  • Potentially breaking: The new code produces different UUID strings for the same input (re-keys everything). Any stored references or comparisons to old UUIDs will no longer match.

Risks & Impact

  • Risk: Existing code/data relying on old UUIDs may break or create duplicates.
  • Mitigation: We can choose to:
    1. Merge only the tests for now, keeping the existing UUID generation.
    2. Merge the bit fix (and re-key), but accept the breaking change or plan a migration.

Next Steps

  1. (Option A) Merge only the tests and skip the version-bit fix for now (to avoid re-keying).
  2. (Option B) Merge both tests and fix, but coordinate any needed data migration or accept re-keying.

@shlokkhemani shlokkhemani changed the base branch from main to develop December 22, 2024 08:11
@odilitime
Copy link
Collaborator

Wow, this is going to rekey everything. Can you say more about the bit change, why is this more correct, what problems could we see if we don't fix it?

@shlokkhemani
Copy link
Author

I discovered the issue while writing tests for the UUID function. Our generated UUIDs weren't strictly setting the version and variant bits as per RFC 4122, which could cause inconsistencies or invalid v5 UUIDs. This fix ensures proper compliance and interoperability, though it does change UUIDs previously generated for the same input.

If we decide we’d rather avoid re-keying right now, I can open a separate PR that includes only the new tests, excluding this bit-fix.

@odilitime
Copy link
Collaborator

IMO better to do it now than later, I just wanted to make sure we had all the relevant info at hand to justify the decision.

@shlokkhemani
Copy link
Author

I’ve just updated the PR description to reflect the impact of the version-bit fix. I propose splitting these changes into two PRs:

  1. Test Coverage Only — merges the new test suite without altering how UUIDs are generated.
  2. Version-Bit Fix — applies the change that correctly sets the version 5 bits, which would re-key existing data.

Since the second part can be disruptive, I’d appreciate guidance or help on how best to handle existing data or references. Let me know your thoughts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants